Abstract-This paper presents a new efficient implementation
approach for elliptic curve cryptosystems, based on a novel finite
field multiplication and a high-performance scalar multiplication
algorithm, for wireless network authentication. In the new finite
field multiplication, the CLNZ sliding window method is applied to
the signed-digit multiplier in order to reduce the number of
multiplication steps. In the scalar multiplication algorithm of the
proposed approach, point addition and point doubling operations are
computed in parallel, and the window technique and signed-digit
representation are used to reduce the number of point operations.
The multiplication cost of the proposed approach is therefore
reduced considerably. The approach also increases the security of
the elliptic curve cryptosystem against timing, simple power
analysis and electromagnetic analysis attacks. The results show
that the proposed finite field multiplication reduces the number
of multiplication steps by about 40%-82.4% for d=2-10 in
comparison with the Montgomery modular multiplication algorithm.
In addition, the efficiency of the proposed implementation
approach improves by about 88%-97% in comparison with the
traditional window NAF elliptic curve scalar multiplication
algorithm, and by about 77% in comparison with the window NAF
elliptic curve scalar multiplication algorithm based on
interleaving, where w=4 and 8, and d=8.

Keywords-Wireless Network Security; Authentication; Elliptic 
Curve Cryptosystem; Scalar Multiplication; Finite Field 
Multiplication; Signed-digit Recoding 

I. INTRODUCTION 

In recent years, wireless networks and mobile devices such as
cell phones, PDAs and smart cards have come into wide use, so
their security has become an increasing concern [1]-[3]. In
wireless networks, remote user authentication over an insecure
channel is an important issue [1]. Because of the limited
bandwidth, computational strength, power availability and storage
of mobile devices, PKC-based remote authentication schemes are
not suitable for mobile devices [1]-[2]. Several authentication
schemes based on elliptic curve cryptography (ECC) have therefore
been proposed [4], [5]-[6]. For example, the wireless transport
layer security (WTLS) of the wireless application protocol (WAP)
uses the elliptic curve Diffie-Hellman (ECDH) authentication
algorithm to authenticate client and server.

ECC was developed independently in 1985 by Koblitz [7] and
Miller [8]. The security of ECC is based on the intractability of
the elliptic curve discrete logarithm problem (ECDLP) [9].
Compared with RSA, ECC offers better performance because it
achieves the same security with a smaller key size
[1],[9]-[10].

In ECC-based authentication algorithms, elliptic curve scalar
multiplication is the core operation and also the most
time-consuming one: it takes 85% of the execution time [11].

Hardware implementation of ECC involves a three-layer
hierarchical strategy, namely finite field arithmetic, point
arithmetic and scalar multiplication, as shown in Fig. 1 [9].



Layer 3: Scalar multiplication
Layer 2: Point addition and point doubling
Layer 1: Multiplication, addition, squaring and inverse in finite field

Fig. 1. Three-layer Model for Elliptic Curve Scalar Multiplication [9]

There have been many attempts to implement these three layers
so as to obtain fast computation, reduce power consumption and
reduce storage space, such as the double-base chain [12],
parallel structures [9]-[10], [13]-[14], recoding techniques
[15]-[16] and high-radix multipliers [17]-[18].

In this paper, a novel finite field multiplication algorithm
based on the constant length nonzero (CLNZ) sliding window
method, multiple bit scan, multiple bit shift and the signed-digit
technique is presented. This new finite field multiplication
algorithm is an improvement of the multiplication method in
[19]. In addition, we propose using this new finite field
multiplication to speed up the elliptic curve scalar
multiplication algorithm, so that the efficiency and the security
of wireless network authentication increase considerably.

The rest of this paper is organized as follows: Section II
describes the methods of scalar multiplication and the
adaptive m-ary canonical recoding multiplication method. The
proposed finite field multiplication algorithm and its
application in the window elliptic curve scalar multiplication
algorithm are presented in Section III. Section IV evaluates
the proposed algorithms. Finally, the conclusion is given in
Section V.

II. BACKGROUND 

A. The Methods of Scalar Multiplication 

Scalar multiplication is achieved by repeated point
addition (PA) and point doubling (PD) operations. The binary
method, which is shown in Fig. 2, is a well-known method for
performing the scalar multiplication Q = kP [20].

INPUT: $k \in [1, n-1]$, $P \in GF(2^n)$;
OUTPUT: $Q = kP$;
1. $Q \leftarrow 0$;
2. For $i = 0$ to $n-1$
3.    if $k_i = 1$ then $Q \leftarrow Q + P$;
4.    $P \leftarrow 2P$;
5. Return $Q$;

Fig. 2. The Binary Scalar Multiplication Method [20]
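
The loop of Fig. 2 can be sketched in Python as follows. This is only a minimal illustration, not the implementation evaluated in this paper: the toy curve $y^2 = x^3 + 2x + 2$ over the prime field $F_{17}$, the base point `G = (5, 1)` and all function names are assumptions chosen to keep the example small, whereas the paper works over $GF(2^n)$.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative assumption, not from the paper).
P_MOD, A_COEF = 17, 2
INF = None          # point at infinity
G = (5, 1)          # a point on the toy curve

def point_add(p, q):
    """Affine point addition; doubling is the special case p == q."""
    if p is INF:
        return q
    if q is INF:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                      # p + (-p) = infinity
    if p == q:
        lam = (3 * x1 * x1 + A_COEF) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def binary_scalar_mul(k, p):
    """Right-to-left binary method: one PD per bit, one PA per nonzero bit."""
    q = INF
    while k > 0:
        if k & 1:                       # step 3: k_i = 1, so add
            q = point_add(q, p)
        p = point_add(p, p)             # step 4: P <- 2P
        k >>= 1
    return q
```

Each loop iteration performs one doubling, and an addition only when the scanned bit is 1, which is why the cost depends on the length and the Hamming weight of k.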

In this method, for an n-bit random k, the expected numbers of
PA and PD operations are $n/2$ and $n$ respectively. The cost of
the binary method depends on the Hamming weight and the length of
the binary representation of k. The binary method can therefore be
improved by scanning a few bits at a time, as in the sliding
window method [11], and by using the non-adjacent form (NAF) of
integers [14], [16], [20]. One efficient method for reducing the
cost of computation is the window scalar multiplication method
[16], which is shown in Fig. 3.

In this algorithm, the sliding window method and signed-digit
representation are used to reduce the cost of multiplication. In
the evaluation stage of this algorithm, when $u_i \neq 0$, the
point $Q_u$ is updated, where $u$ satisfies $\alpha_u = u_i$ or
$\alpha_{-u} = -u_i$. Scalar multiplication costs $\frac{m}{w+1}A$
in step 4, and step 5 costs $\sum_{j=1}^{h} \frac{l}{w_j+1}A$,
where A denotes the PA cost, D denotes the PD cost, w denotes the
window width and $l$ denotes the length of the NAF.
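
The factor $\frac{m}{w+1}$ reflects the fact that, on average, about one digit in every $w+1$ of a width-w NAF is nonzero. The sketch below checks this empirically. As a simplifying assumption it recodes ordinary integers rather than computing the $\tau$-adic TNAF used in Fig. 3; the names `width_w_naf` and `average_nonzero_digits`, the sample size and the seed are hypothetical choices made for illustration.

```python
import random

def width_w_naf(k, w):
    """Return the width-w NAF digits of k > 0, least significant first."""
    digits = []
    while k > 0:
        if k & 1:
            d = k % (1 << w)            # residue of k modulo 2^w ...
            if d >= 1 << (w - 1):       # ... mapped into (-2^(w-1), 2^(w-1))
                d -= 1 << w
            k -= d                      # k is now divisible by 2^w
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits

def average_nonzero_digits(m, w, samples=200, seed=1):
    """Average number of nonzero digits over random m-bit scalars."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        k = rng.getrandbits(m) | (1 << (m - 1))   # force an m-bit scalar
        total += sum(1 for d in width_w_naf(k, w) if d != 0)
    return total / samples

if __name__ == "__main__":
    for w in (2, 4, 8):
        # Compare the measured average with the estimate m / (w + 1).
        print(w, average_nonzero_digits(160, w), 160 / (w + 1))
```

With w = 2 the recoding reduces to the ordinary NAF, whose density of about one third is the source of the $n/3$ figure quoted for canonical recoding.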

B. The Adaptive M-ary Canonical Recoding Multiplication 
Method 

The m-ary method and canonical recoding are two well-known
methods for reducing the total number of additions in the
multiplication of two integers. The m-ary (radix-m) method uses
segmentation and pre-computation to reduce the number of
additions [14], [19], [21]-[22]. In the binary representation,
the probability that a d-bit word consists entirely of zeros is
$2^{-d}$, so longer zero words are less probable. To increase the
zero-word probability, Koc and Hung [19] proposed an adaptive
m-ary method which allows zero words of variable length and so
improves the zero-word probability while still using relatively
long words in the segmentation process.

According to [19], in computing P = X.Y, additions may be
skipped whenever the corresponding bit of the multiplier is
zero. The binary multiplication algorithm and the canonical
recoding multiplication algorithm therefore require on average
$n/2$ and $n/3$ addition operations respectively. Koc and Hung
[19] combined the adaptive m-ary segmentation algorithm with the
canonical recoding algorithm to obtain the adaptive m-ary
segmentation canonical recoding multiplication algorithm, which
is shown in Fig. 4.



INPUT: $w$; $k \in [1, n-1]$; $P \in GF(2^n)$;
OUTPUT: $Q = kP$;
1. Use algorithm 3 (in appendix [16]) to compute $\rho' = k \text{ partmod } \delta$;
2. Use algorithm 4 (in appendix [16]) to compute $TNAF_w(\rho') = \sum_{i=0}^{l-1} u_i \tau^i$;
3. For $u \in U = \{1, 3, 5, \ldots, 2^{w-1} - 1\}$ let $Q_u \leftarrow \infty$;
4. For $i = l - 1$ to $0$
   4.1. If $u_i \neq 0$ then
        Let $u$ satisfy $\alpha_u = u_i$ or $\alpha_{-u} = -u_i$;
        If $u > 0$ then $Q_u \leftarrow Q_u + P$;
        Else $Q_{-u} \leftarrow Q_{-u} - P$;
   4.2. $P \leftarrow \tau P$;
5. Compute $Q \leftarrow \sum_{u \in U} \alpha_u Q_u$;
6. Return $Q$;

Fig. 3. The Window Scalar Multiplication Method [16]



Input: $X, Y$;
Output: $P = X.Y$;
{Recoding phase}
1. Compute $D$ by performing canonical recoding on $X$;
{Pre-computation phase}
2. Decompose $D$ into $X^*$ by using the adaptive m-ary method;
3. Compute and store $w.Y$ for all canonically recoded d-digit numbers of $D$;
{Multiplication phase}
4. Set $j = 0$ and $X_j^* = 0$;
5. For $i = 0$ to $n - 1$ do one of the following:
6.    Case 0. ($X_j^* = 0$) append $X_i$ to $X_j^*$;
7.    Case 1. ($X_j^* \in W$ and $X_i$ is zero) append $X_i$ to $X_j^*$;
8.    Case 2. ($X_j^* \in W$ and $X_i$ is nonzero) set $j = j + 1$ and $X_j^* = X_i$;
9.    Case 3. ($X_j^* \in W$ and $1 \le l_j \le d-2$) append $X_i$ to $X_j^*$;
10.   Case 4. ($X_j^* \in W$ and $l_j = d-1$) append $X_i$ to $X_j^*$; set $j = j + 1$ and $X_j^* = 0$;
11. Set $k = j + 1$ and $P = 0$;
12. For $j = k - 1$ to $0$
13.    Compute $P = 2^{l_j} P + X_j^*.Y$;
14. Return ($P$);

Fig. 4. The Adaptive M-ary Segmentation Canonical Recoding
Multiplication Algorithm [19]

In this algorithm, $l_j$ denotes the length of $X_j^*$. The
probability of zero bits is increased by the canonical recoding
technique, and the total computation time is reduced by the m-ary
segmentation.

III. THE PROPOSED IMPLEMENTATION APPROACH OF ELLIPTIC 
CURVE CRYPTOSYSTEM 

In this section, we propose a new efficient implementation
approach for elliptic curve cryptosystems, based on a parallel
structure for layer 3 and a new efficient finite field
multiplication for layer 1 of Fig. 1.

A. Scalar Multiplication

The basic operations in all scalar multiplication algorithms
are point addition and point doubling over an elliptic curve.
Executing these point operations in parallel increases both the
speed and the security of the cryptosystem considerably. We
therefore use the window scalar multiplication algorithm of [16],
which allows these point operations to be executed in parallel,
and we propose an efficient finite field multiplication algorithm
to speed up this scalar multiplication algorithm.

B. The Finite Field Arithmetic 

Modular multiplication is the major finite field arithmetic
operation, so the performance of ECC depends heavily on the
efficiency of modular multiplication in the finite field
[9]-[10].

In conventional finite field multiplication, the partial result
is shifted by one bit per iteration. Multiplication by a zero bit
yields zero, yet this multiplication is still performed in each
iteration. In this paper, we propose a new modified Montgomery
modular multiplication that recodes the multiplier, partitions
the resulting signed-digit multiplier, and processes a zero
partition of any length in a single cycle instead of several
cycles. The proposed modified Montgomery modular multiplication
(M4) algorithm is shown in Fig. 5.

In this algorithm, $l_i$ is the length (i.e. number of bits) of
the i-th partition, $\#\Pi(D)$ is the number of partitions of the
multiplier and $V_i$ is the corresponding partition value.

In the recoding phase of this new algorithm, canonical
recoding is performed on the multiplier. In the partitioning
phase, CLNZ partitioning is performed on the signed-digit
multiplier so that the number of zero partitions is as large as
possible, which reduces the number of multiplication steps
considerably. The CLNZ partitioning method scans the multiplier
from the least significant digit to the most significant digit
according to the strategy shown in Fig. 6. Under this strategy,
zero windows may have an arbitrary length, whereas nonzero
windows have a length of exactly d digits.

For example, for $X = (0111111100111111101)_2$, the canonical
recoding of X is $D = (01000000\bar{1}01000000\bar{1}01)$ and,
for $d = 3$, the windows formed are
$\Pi(D) = (001), (000000), (\bar{1}01), (000000), (\bar{1}01)$.
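
The worked example can be reproduced with the short sketch below. It is an illustration under stated assumptions rather than the authors' implementation: `canonical_recoding` and `clnz_partition` are hypothetical names, digits are stored least significant first, and the most significant nonzero window is zero-padded to d digits, as the window (001) in the example suggests.

```python
def canonical_recoding(x):
    """Signed-digit (NAF) recoding of x >= 0, least significant digit first."""
    digits = []
    while x > 0:
        d = 2 - (x & 3) if x & 1 else 0   # +1 if x = 1 mod 4, -1 if x = 3 mod 4
        x = (x - d) >> 1
        digits.append(d)
    return digits

def clnz_partition(digits, d):
    """Split signed digits (LSD first) into zero windows of any length and
    nonzero windows of exactly d digits. Returns (value, length) pairs."""
    parts, i = [], 0
    while i < len(digits):
        if digits[i] == 0:
            j = i
            while j < len(digits) and digits[j] == 0:
                j += 1                     # a zero window absorbs the whole zero run
            parts.append((0, j - i))
        else:
            j = i + d
            window = digits[i:j]           # may be short at the top; treated as zero-padded
            value = sum(dig << pos for pos, dig in enumerate(window))
            parts.append((value, d))
        i = j
    return parts

X = 0b0111111100111111101
D = canonical_recoding(X)       # signed digits of X, least significant first
PARTS = clnz_partition(D, 3)    # CLNZ windows for d = 3, least significant first
```

Here `PARTS` holds five windows, so the multiplier is consumed in five steps instead of one step per bit, and every nonzero window value is odd.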

In the pre-computation phase of the proposed M4 algorithm,
the least significant digit of a nonzero partition is either 1 or
$\bar{1}$, which implies that the value of a nonzero partition is
always an odd number. We therefore only need to pre-compute
$V_i Y$ for odd values of $V_i$.

In the new M4 algorithm, the pre-computation phase and the
partitioning phase are performed independently in parallel, which
speeds up the modular multiplication.

The multiplication phase of the M4 algorithm is performed
w times, where w denotes the number of partitions of the
signed-digit multiplier. In each iteration of the multiplication
phase, $l_i$ bits of the multiplier and the n-bit multiplicand
are processed.



Input: $X, Y, M$;
Output: $P := X.Y.2^{-n} \bmod M$;
1. $P = 0$;
{Recoding phase}
2. Compute $D$ by performing canonical recoding on $X$;
Parallel begin
{Partitioning phase}
3. Build $\Pi(D)$ using the given strategy;
4. Let $w = \#\Pi(D)$;
{Pre-computation phase}
5. Compute and store $V_i Y$;
Parallel end
{Multiplication phase}
6. For $i = 0$ to $w - 1$
7.    $P := P + V_i Y$;
8.    $m := P M' \bmod 2^{l_i}$;
9.    $P := (P + m.M) / 2^{l_i}$;
10. If ($P > M$) then $P = P - M$;
11. Return ($P$);

Fig. 5. The New Modified Montgomery Modular Multiplication (M4)
Algorithm
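
The multiplication phase of Fig. 5 can be sketched as follows. This is a software illustration under several assumptions, not the hardware design of this paper: the name `m4_multiply` is hypothetical, the partitions are passed in as (value, length) pairs with the least significant partition first, $M'$ is taken to be $-M^{-1} \bmod 2^{l_i}$ and is recomputed for each partition length, and the final correction uses a full reduction modulo M because signed partition values can make the intermediate result negative.

```python
def m4_multiply(partitions, y, modulus):
    """Return X*Y*2^(-n) mod modulus, where X is given by its partitions
    (V_i, l_i), least significant first, and n is the sum of all l_i.
    The modulus must be odd."""
    p = 0
    for value, length in partitions:
        radix = 1 << length
        if value:                                    # step 7; nothing to add for a zero partition
            p += value * y
        m_prime = -pow(modulus, -1, radix) % radix   # assumed M' = -M^(-1) mod 2^(l_i)
        m = (p * m_prime) % radix                    # step 8
        p = (p + m * modulus) >> length              # step 9: the division is exact
    return p % modulus                               # final correction, also handles p < 0

# Partitions of the example multiplier X = 260605 for d = 3 (least significant first).
PARTITIONS = [(-3, 3), (0, 6), (-3, 3), (0, 6), (1, 3)]
X, Y, M = 260605, 123456789, (1 << 61) - 1
N = sum(length for _, length in PARTITIONS)          # total number of digits shifted out
RESULT = m4_multiply(PARTITIONS, Y, M)
```

The loop runs once per partition, so a zero partition of any length costs a single iteration, which is the saving that the M4 algorithm aims for.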



Input: $D, d$;
Output: $\Pi(D)$;

ZP:  Check the incoming single digit;
     If it is zero then stay in ZP
     Else go to NZP;
NZP: Stay in NZP until all d digits are collected;
     Check the incoming single digit;
     If it is zero then go to ZP
     Else go to NZP;
Return ($\Pi(D)$)

Fig. 6. Partitioning Strategy

IV. EVALUATION

A. Evaluation of M4 Algorithm 

In the proposed modified Montgomery modular multiplication
(M4) algorithm, we use the signed-digit recoding technique and
the CLNZ sliding window method, so the Hamming weight of the
multiplier is $\frac{3n}{3d+4}$. Thus the M4 algorithm reduces
the multiplication steps by at least about:

$$1 - \frac{\frac{6n}{3d+4}}{n} = 1 - \frac{6}{3d+4} \qquad (1)$$



Table I shows the multiplication step reduction of the M4
algorithm in comparison with the Montgomery modular
multiplication algorithm [23] for various d.

Table I

MULTIPLICATION STEP IMPROVEMENT OF M4 ALGORITHM OVER
MONTGOMERY MODULAR MULTIPLICATION

d      Multiplication step reduction (%)
2      40
3      53.8
4      62.5
5      68.4
6      72.7
7      76
8      78.6
9      80.6
10     82.4


As shown in Table I, the M4 algorithm reduces the number of
multiplication steps by about 40%-82.4% for d=2-10.
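
The entries of Table I follow from formula (1) read as $1 - \frac{6}{3d+4}$, which can be checked with a few lines of Python. The names `step_reduction` and `TABLE_I` are introduced here for illustration only.

```python
def step_reduction(d):
    """Fraction of multiplication steps saved according to formula (1)."""
    return 1 - 6 / (3 * d + 4)

# Percentage reduction for d = 2..10, rounded to one decimal place as in Table I.
TABLE_I = {d: round(100 * step_reduction(d), 1) for d in range(2, 11)}

if __name__ == "__main__":
    for d, percent in TABLE_I.items():
        print(d, percent)
```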

B. Evaluation of the Proposed Implementation Approach 

According to the computational analysis of [16], the
implementation approach of the traditional window NAF
(TWN) scalar multiplication algorithm costs:

$$D + (2^{w-2} - 1)A + \frac{m}{w+1}A + mD \qquad (2)$$

In addition, the implementation approach of the window
scalar multiplication algorithm based on interleaving (IWN)
costs:

$$\frac{m}{w+1}A + \sum_{j=1}^{h} \frac{l}{w_j+1}A \qquad (3)$$

The costs of the point addition (PA) and point doubling (PD)
operations are equal in affine coordinates, whereas in
projective coordinates a PA operation costs twice as much as a
PD operation [16]. The proposed implementation approach of the
elliptic curve cryptosystem combines the M4 algorithm with the
IWN approach. It therefore requires the same point operations as
the IWN approach, but the cost of each point addition is reduced
considerably.

Assume a digit length of m=160 bits. We compare the cost of
the proposed implementation approach with the TWN and IWN
approaches by analyzing formulas (2) and (3). Figs. 7-9 show
these comparisons for various window widths w in affine and
projective coordinates for d=2, 4, 6, 8 and 10.

As shown in Fig. 7, Fig. 8 and Fig. 9, the proposed
implementation approach is more efficient than the TWN and IWN
approaches when the window width (w) and the radix (d) range
from 2 to 10. For window widths w = 4 and w = 8 and d = 8,
Table II and Table III compare the costs of the TWN approach, the
IWN approach and the proposed implementation approach for digit
lengths of 160, 192 and 224.



[Figure. Horizontal axis: window width. Curves: IWN approach; proposed approach for d=2, 4, 6, 8 and 10.]

Fig. 7. Comparison of the Cost between the Proposed Implementation
Approach and IWN Approach at Affine Coordinate and Projective
Coordinates

[Figure. Horizontal axis: window width. Curves: TWN approach; proposed approach for d=2, 4, 6, 8 and 10.]

Fig. 8. Comparison of the Cost between the Proposed Implementation
Approach and TWN Approach at Affine Coordinate

[Figure. Horizontal axis: window width. Curves: TWN approach; proposed approach for d=2, 4, 6, 8 and 10.]

Fig. 9. Comparison of the Cost between Proposed Implementation
Approach and TWN Approach at Projective Coordinate

Table II

COMPARISON OF THE COST AT AFFINE COORDINATE

Digit length   Implementation approach    w=4       w=8
160            IWN                        64        35.56
               TWN                        196       241.78
               Proposed in this paper     13.7      7.61
192            IWN                        76.8      42.67
               TWN                        234.4     277.33
               Proposed in this paper     16.44     9.13
224            IWN                        89.6      49.78
               TWN                        272.8     312.89
               Proposed in this paper     19.17     10.65

Table III

COMPARISON OF THE COST AT PROJECTIVE COORDINATE

Digit length   Implementation approach    w=4       w=8
160            IWN                        64        35.56
               TWN                        115.5     161.28
               Proposed in this paper     13.7      7.61
192            IWN                        76.8      42.67
               TWN                        137.9     180.83
               Proposed in this paper     16.44     9.13
224            IWN                        89.6      49.78
               TWN                        160.3     200.39
               Proposed in this paper     19.17     10.65


CISME Vol.1 No.3 2011 PP.21-25 www.jcisme.org ©2011 World Academic Publishing
Communications in Information Science and Management Engineering

As shown in Table II and Table III, the efficiency of the
proposed implementation approach improves by about 77% in
comparison with the IWN approach and by about 88%-97% in
comparison with the TWN approach, where w = 4 and 8, and d = 8.
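
The tabulated costs can be reproduced approximately with the sketch below, which is offered as a consistency check and not as the authors' evaluation code. The TWN cost is formula (2) read as $D + (2^{w-2} - 1)A + \frac{m}{w+1}A + mD$. Two further points are assumptions inferred from the numbers in Table II and Table III rather than statements of the paper: the IWN entries equal $\frac{2m}{w+1}A$, and the proposed cost equals the IWN cost multiplied by the remaining step fraction $\frac{6}{3d+4}$ of the M4 algorithm. Costs are expressed in units of one point addition, with D = A in affine coordinates and D = A/2 in projective coordinates.

```python
def twn_cost(m, w, A=1.0, D=1.0):
    """Formula (2): D + (2^(w-2) - 1)A + m/(w+1) A + mD."""
    return D + (2 ** (w - 2) - 1) * A + m / (w + 1) * A + m * D

def iwn_cost(m, w, A=1.0):
    """Assumed closed form 2m/(w+1) A, which matches the tabulated IWN values."""
    return 2 * m / (w + 1) * A

def proposed_cost(m, w, d, A=1.0):
    """Assumption: IWN cost scaled by the remaining M4 step fraction 6/(3d+4)."""
    return iwn_cost(m, w, A) * 6 / (3 * d + 4)

def improvement(baseline, proposed):
    """Relative cost saving of the proposed approach over a baseline."""
    return 1 - proposed / baseline

if __name__ == "__main__":
    for m in (160, 192, 224):
        for w in (4, 8):
            new = proposed_cost(m, w, d=8)
            print(m, w,
                  round(iwn_cost(m, w), 2),
                  round(twn_cost(m, w), 2),            # affine: D = A
                  round(twn_cost(m, w, D=0.5), 2),     # projective: D = A/2
                  round(new, 2))
```

Under these assumptions, the saving over the TWN approach for m = 160 ranges from about 88% (w = 4, projective coordinates) to about 97% (w = 8, affine coordinates).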

C. Security Analysis 

In the proposed implementation approach of the elliptic curve
cryptosystem, the PD and PA operations are performed in parallel.
In addition, the sliding window and recoding techniques are used
in the scalar multiplication algorithm. According to [24], the
proposed implementation approach therefore resists timing
analysis attacks and simple power analysis (SPA) attacks.
Furthermore, the M4 algorithm uses the recoding technique and the
CLNZ sliding window method, which makes it hard to extract key
information by measuring the currents flowing through each
component of the cryptographic device. A cryptosystem that uses
the proposed implementation approach therefore resists
electromagnetic analysis (EMA) attacks.

V. CONCLUSION 

In this paper, a novel finite field multiplication algorithm
based on signed-digit representation and the CLNZ sliding
window method is presented. In this new finite field
multiplication, the signed-digit representation is used to
increase the probability of zero bits and the CLNZ sliding window
method is used to reduce the number of multiplication steps. In
addition, we propose a new efficient implementation approach of
the elliptic curve cryptosystem for wireless network
authentication based on this new finite field multiplication
algorithm. In this new implementation approach, PA and PD
operations are computed in parallel, and both the window
technique and the recoding technique are used to reduce the
computation cost. The proposed implementation approach also
increases the security of the cryptosystem considerably.

The results show that the number of multiplication steps in
the M4 algorithm is reduced by about 40%-82.4% for d=2-10
in comparison with the Montgomery modular multiplication
algorithm. The efficiency of the proposed implementation
approach also improves by about 77% and 88%-97% in comparison
with the IWN approach and the TWN approach respectively,
where w=4 and 8, and d=8.